OPI-JSA at SemEval-2017 Task 1: Application of Ensemble learning for computing semantic textual similarity

نویسندگان

Martyna Spiewak

Piotr Sobecki

Daniel Karas

چکیده

Semantic Textual Similarity (STS) evaluation assesses the degree to which two parts of texts are similar, based on their semantic evaluation. In this paper, we describe three models submitted to STS SemEval 2017. Given two English parts of a text, each of proposed methods outputs the assessment of their semantic similarity. We propose an approach for computing monolingual semantic textual similarity based on an ensemble of three distinct methods. Our model consists of recursive neural network (RNN) text auto-encoders ensemble with supervised a model of vectorized sentences using reduced part of speech (PoS) weighted word embeddings as well as unsupervised a method based on word coverage (TakeLab). Additionally, we enrich our model with additional features that allow disambiguation of ensemble methods based on their efficiency. We have used Multi-Layer Perceptron as an ensemble classifier basing on estimations of trained Gradient Boosting Regressors. Results of our research proves that using such ensemble leads to a higher accuracy due to a fact that each memberalgorithm tends to specialize in particular type of sentences. Simple model based on PoS weighted Word2Vec word embeddings seem to improve performance of more complex RNN based auto-encoders in the ensemble. In the monolingual EnglishEnglish STS subtask our Ensemble based model achieved mean Pearson correlation of .785 compared with human annotators.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

UdL at SemEval-2017 Task 1: Semantic Textual Similarity Estimation of English Sentence Pairs Using Regression Model over Pairwise Features

This paper describes the model UdL we proposed to solve the semantic textual similarity task of SemEval 2017 workshop. The track we participated in was estimating the semantics relatedness of a given set of sentence pairs in English. The best run out of three submitted runs of our model achieved a Pearson correlation score of 0.8004 compared to a hidden human annotation of 250 pairs. We used ra...

متن کامل

STS-UHH at SemEval-2017 Task 1: Scoring Semantic Textual Similarity Using Supervised and Unsupervised Ensemble

This paper reports the STS-UHH participation in the SemEval 2017 shared Task 1 of Semantic Textual Similarity (STS). Overall, we submitted 3 runs covering monolingual and cross-lingual STS tracks. Our participation involves two approaches: unsupervised approach, which estimates a word alignment-based similarity score, and supervised approach, which combines dependency graph similarity and cover...

متن کامل

MITRE at SemEval-2017 Task 1: Simple Semantic Similarity

This paper describes MITRE’s participation in the Semantic Textual Similarity task (SemEval-2017 Task 1), which evaluated machine learning approaches to the identification of similar meaning among text snippets in English, Arabic, Spanish, and Turkish. We detail the techniques we explored, ranging from simple bag-ofngrams classifiers to neural architectures with varied attention and alignment m...

متن کامل

Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity

This paper describes our proposed solutions designed for a STS core track within the SemEval 2016 English Semantic Textual Similarity (STS) task. Our method of similarity detection combines recursive autoencoders with a WordNet award-penalty system that accounts for semantic relatedness, and an SVM classifier, which produces the final score from similarity matrices. This solution is further sup...

متن کامل

NUIG-UNLP at SemEval-2016 Task 1: Soft Alignment and Deep Learning for Semantic Textual Similarity

We present a multi-feature system for computing the semantic similarity between two sentences. We introduce the use of soft alignment for computing text similarity, and also evaluate different methods to produce it. The main features used by our system are based on alignment and Explicit Semantic Analysis. Our system was above the median scores for 4 out of the 5 datasets at SemEval 2016 STS Ta...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2017

OPI-JSA at SemEval-2017 Task 1: Application of Ensemble learning for computing semantic textual similarity

نویسندگان

چکیده

منابع مشابه

UdL at SemEval-2017 Task 1: Semantic Textual Similarity Estimation of English Sentence Pairs Using Regression Model over Pairwise Features

STS-UHH at SemEval-2017 Task 1: Scoring Semantic Textual Similarity Using Supervised and Unsupervised Ensemble

MITRE at SemEval-2017 Task 1: Simple Semantic Similarity

Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity

NUIG-UNLP at SemEval-2016 Task 1: Soft Alignment and Deep Learning for Semantic Textual Similarity

عنوان ژورنال:

اشتراک گذاری